Automatic image annotation (also known as automatic image tagging or linguistic indexing) is the process by which a computer system automatically assigns metadata in the form of captions or keywords to a digital image. This application of computer vision techniques is used in image retrieval systems to organize and locate images of interest from a database.
This method can be regarded as a type of multi-class image classification with a very large number of classes, as large as the vocabulary size. Typically, image analysis in the form of extracted feature vectors, together with the training annotation words, is used by machine learning techniques to attempt to automatically apply annotations to new images. The first methods learned the correlations between image features and training annotations. Subsequently, techniques were developed using machine translation to attempt to translate the textual vocabulary into the "visual vocabulary", represented by clustered regions known as blobs. Later work has included classification approaches, relevance models, and other related methods.
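As a rough illustration of how annotations can be transferred from training images to a new image, the following minimal sketch uses nearest-neighbour voting over feature vectors. It is not the method of any particular work listed in this article: the three-dimensional feature vectors, the tag vocabulary, the function name `annotate`, and the distance-based weighting are all assumptions chosen for brevity.

```python
import math
from collections import Counter


def annotate(features, training_set, k=3, n_tags=2):
    """Suggest n_tags annotation words for an image described by `features`.

    training_set is a list of (feature_vector, set_of_tags) pairs.
    Hypothetical helper for illustration only.
    """
    # Keep the k training images closest to the query in feature space.
    neighbours = sorted(
        training_set, key=lambda item: math.dist(features, item[0])
    )[:k]

    # Each neighbour votes for its tags; closer neighbours get larger weights.
    votes = Counter()
    for vector, tags in neighbours:
        weight = 1.0 / (1.0 + math.dist(features, vector))
        for tag in tags:
            votes[tag] += weight

    return [tag for tag, _ in votes.most_common(n_tags)]


# Toy training data: made-up feature vectors paired with annotation words.
training_set = [
    ((0.9, 0.1, 0.1), {"sunset", "sky"}),
    ((0.8, 0.2, 0.1), {"sunset", "beach"}),
    ((0.1, 0.8, 0.2), {"forest", "tree"}),
    ((0.2, 0.9, 0.1), {"forest", "grass"}),
    ((0.1, 0.2, 0.9), {"sea", "sky"}),
]

if __name__ == "__main__":
    print(annotate((0.85, 0.15, 0.1), training_set, k=2, n_tags=1))
```

In this toy setting, every word in the vocabulary acts as a class, which is why the task resembles classification with as many classes as there are annotation words. Real systems would use far richer features and much larger vocabularies.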
The advantage of automatic image annotation over content-based image retrieval (CBIR) is that queries can be specified more naturally by the user. At present, CBIR generally requires users to search by image concepts such as color and texture, or by finding example images to use as queries. However, certain image features in example images may override the concept that the user is truly focusing on. Traditional methods of image retrieval, such as those used by libraries, have relied on manually annotated images; manual annotation is expensive and time-consuming, especially given the large and constantly growing image databases in existence.
See also
- Content-based image retrieval
- Object categorization from image search
- Object detection
- Outline of object recognition
Further reading
- Annotation as machine translation
- Automatic linguistic indexing of pictures
- Hierarchical Aspect Cluster Model
- Latent Dirichlet Allocation model
- Supervised multiclass labeling
- Ensemble of Decision Trees and Random Subwindows
- Relevance models using continuous probability density functions
- Multiple Bernoulli distribution
- Multiple design alternatives
- Relevant low-level global filters
- Global image features and nonparametric density estimation
- Image Annotation Refinement
- Automatic Image Annotation by Ensemble of Visual Descriptors
- A New Baseline for Image Annotation
- Simultaneous Image Classification and Annotation
- TagProp: Discriminative Metric Learning in Nearest Neighbor Models for Image Auto-Annotation
- Image Annotation Using Metric Learning in Semantic Neighbourhoods
- Automatic Image Annotation Using Deep Learning Representations
- Holistic Image Annotation using Salient Regions and Background Image Information
- Medical Image Annotation using Bayesian networks and active learning